Added support for the "think" for Ollama #3386
Conversation
      1. Added the `think` field to Ollama's `ChatRequest`
      2. Added the `thinking` field to Ollama's `Message`
      3. Added the `think` property to `OllamaOptions`, allowing users to specify whether to enable or disable thinking
Signed-off-by: Sun Yuhan <[email protected]>
…fault behavior, thereby ensuring compatibility with older versions of Ollama calls. Signed-off-by: Sun Yuhan <[email protected]>
@tzolov @ilayaperumalg @markpollack Could you please help review this PR? Thank you.
…tainer image version of ollama to 0.9.0 Signed-off-by: Sun Yuhan <[email protected]>
Yes, we will review. Thanks
        
          
models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java

models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaApi.java
      …tionMetadata`, and added corresponding unit tests. Signed-off-by: Sun Yuhan <[email protected]>
Signed-off-by: Sun Yuhan <[email protected]>
Consider adding a field for freely extensible parameters; that would make it flexible enough to accommodate compatible parameters and avoid lagging behind Ollama's updates.
Hi @markpollack, I hope you're doing well! I just wanted to check in and see if there are any further changes needed for this PR. Let me know if there's anything else I should address — happy to make adjustments if needed! Thanks again for your time and review!
I would like to ask: if this PR is merged, will this feature be included in the snapshot version of the library? And how can developers who are currently using the GA version use this feature?
          
It seems to me that this feature should be included in 1.1.x, as well as being backported to 1.0.x. By the time the next 1.0.x version is officially released, users currently on the 1.0.0 GA version will be able to use it by upgrading to that 1.0.x version.
…-thinking # Conflicts: # models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaApi.java # models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama/OllamaImage.java
Yes, I will backport it. Sorry about the delay. Only now getting my project mgmt hat on full time @sunyuhan1998
          
No problem at all! Please feel free to let me know if any changes are needed.
I'm still working through this. Exploring thinking more generally as an overall feature.
          
No problem. If there's any part I can contribute to, please feel free to let me know. I'd be happy to get involved.
@sunyuhan1998 Sorry about the delay in getting this in. Could you rebase this PR from the latest main please? Thanks
Signed-off-by: Sun Yuhan <[email protected]>
…tionMetadata`, and added corresponding unit tests. Signed-off-by: Sun Yuhan <[email protected]>
379ea2c to a897177
Signed-off-by: Sun Yuhan <[email protected]>
…to feat-support-ollama-thinking # Conflicts: # models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaApi.java # models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaApiHelper.java # models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaModel.java # models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java # models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama/OllamaChatModelMetadataTests.java # models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama/OllamaImage.java # models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama/api/OllamaApiIT.java
Signed-off-by: Sun Yuhan <[email protected]>
          
Hi, I've rebased this branch onto the latest main branch and properly resolved all conflicts. I sincerely apologize for the messy commit history in my previous submissions. This was caused by an earlier merge of the main branch into the current branch, which led to some issues during the subsequent rebase process. However, everything has now been cleaned up, and the final code has been restored to a proper state.

Additionally, I'd like to clarify: the changes originally included in this PR to add …

The PR has now been updated to its latest state and is ready for review. Thank you!
Hi. Thanks so much @sunyuhan1998. Regarding sending the previous thinking back, this comes up in other models as well now. We can perhaps stick it in the metadata of the message for now. The design of the messaging has changed; I think we can improve the model in 2.0, since most leading models have 'thinking'.
```java
 */
@Nullable
default Boolean getThink() {
	return false;
}
```
We can't add 'think' to default chat option as a simple boolean is not sufficient to capture the full range of options across models, e.g. there is often a thinking token budget or other parameters.
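To illustrate why a bare Boolean may not be enough, here is a minimal sketch of a richer option type. The names (`ThinkingOptions`, `tokenBudget`) are hypothetical, not Spring AI API; the point is only that extras like a thinking token budget need room alongside the on/off flag.

```java
// Hypothetical sketch, not Spring AI API: a richer "thinking" option than a
// bare Boolean, leaving room for provider extras such as a token budget.
record ThinkingOptions(Boolean enabled, Integer tokenBudget) {

	// Simple on/off, as Ollama supports today.
	static ThinkingOptions of(boolean enabled) {
		return new ThinkingOptions(enabled, null);
	}

	// Enabled thinking with a provider-specific token budget.
	static ThinkingOptions withBudget(int tokens) {
		return new ThinkingOptions(true, tokens);
	}
}
```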
Indeed, initially this attribute was designed solely to enable or disable Ollama's thinking. If we consider it to represent the entire content of "thinking" and intend to apply it across all models, it should be a complex object. I'm currently considering what properties we should include in it. Do you have any more specific suggestions?
The semantics of Ollama are: when not explicitly specified (null), the server embeds the thought process within `<think>` tags in the content; explicitly false disables thinking; explicitly true separates it into message.thinking.

The current ChatOptions interface provides a default implementation: `default Boolean getThink() { return false; }`. Although the concrete implementation classes override this to return nullable values, if any implementation fails to override it in the future, "not set" will be treated as false, thereby altering the default behavior.
```java
@Override
@Nullable
public Boolean getThink() {
	return this.think;
}
```
As with ChatOptions, the boolean itself isn't sufficient across all model providers.
```java
assertThat(response.done()).isTrue();
assertThat(response.message().role()).isEqualTo(Role.ASSISTANT);
assertThat(response.message().content()).contains("Sofia");
assertThat(response.message().thinking()).isNotEmpty();
```
this assertion isn't passing.
I'm using the Qwen model, and it works fine on my side. Perhaps we need to adjust this assertion, as the model's output has some randomness. I'll modify it soon.
```java
assertThat(responses.stream()
	.filter(r -> r.message() != null)
	.map(r -> r.message().thinking())
	.collect(Collectors.joining(System.lineSeparator()))).contains("Sofia");
```
this assertion is failing for me.
```java
assertThat(responses.stream()
	.filter(r -> r.message() != null)
	.map(r -> r.message().thinking())
	.collect(Collectors.joining(System.lineSeparator()))).contains("solar");
```
this assertion is failing for me.
Moving to 1.1 RC1 due to the need to move 'think' out of the high-level chat options interfaces and to debug the test failures.
Hello, may I ask whether this feature will no longer be present in version 1.1.0-M4?
Hi @shuanggegege, yes, this feature is targeted to be merged in 1.1.0-RC1, hopefully.
OK, I understand. Thank you @ilayaperumalg
```java
if (ollamaResponse.promptEvalCount() != null && ollamaResponse.evalCount() != null) {
	generationMetadata = ChatGenerationMetadata.builder()
		.finishReason(ollamaResponse.doneReason())
		.metadata("thinking", ollamaResponse.message().thinking())
```
Currently, metadata is only constructed when both promptEvalCount and evalCount exist, along with "thinking". This results in some responses containing thinking not being recorded due to missing count fields.
Placing "thinking" directly as a top-level key in ChatGenerationMetadata may conflict with future keys of the same name from other providers. It is recommended to consider using a namespace prefix (e.g., "ollama.thinking") or defining the key name in a public constant to avoid collisions.
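A minimal sketch of the constant-based suggestion (the class name and key value below are illustrative, not code from the PR):

```java
// Illustrative only: a namespaced, shared constant for the metadata key,
// instead of the bare string "thinking" scattered through the code.
final class OllamaMetadataKeys {

	// Prefixing with the provider name avoids collisions with other
	// providers that may also expose a "thinking" metadata entry.
	static final String THINKING = "ollama.thinking";

	private OllamaMetadataKeys() {
	}
}
```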
```java
	.messages(ollamaMessages)
	.options(requestOptions)
	.think(requestOptions.getThink());
```
When think is null, make sure it is not serialized into the JSON request at all, so that the server's default behavior is triggered. It is recommended to set it only when non-null:

```java
Boolean think = requestOptions.getThink();
if (think != null) {
	requestBuilder.think(think);
}
```
```java
 * Qwen3 1.7b
 */
QWEN_3_1_7_B("qwen3:1.7b"),

/**
 * Qwen3 0.6b
 */
QWEN_3_06B("qwen3:0.6b"),
```
Among the existing constants, there is QWEN3_4B, while the newly added ones are QWEN_3_1_7_B and QWEN_3_06B. The naming styles are inconsistent (such as the use of underscores and the separation of numbers and units). It is recommended to adopt a unified style (for example, using QWEN3_1_7B / QWEN3_0_6B consistently or applying a uniform underscore scheme) to ensure readability and searchability.
Fixes #3383
As mentioned in issue #3383, Ollama added support for "think" in its latest 0.9.0 release:
https://github.com/ollama/ollama/releases
https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion
This PR implements support for that attribute and includes the following key changes:
1. Added the `think` field to Ollama's `ChatRequest`
2. Added the `thinking` field to Ollama's `Message`
3. Added the `think` property to `OllamaOptions`, allowing users to specify whether to enable or disable thinking

Actually, there is currently another issue:

As stated in Ollama's API documentation here, during requests to Ollama, the `message` field supports sending the model's own reasoning (thoughts) back to it. However, `AssistantMessage` does not currently support transmitting this field, which means the model will not be aware of its previous thoughts.

Therefore, perhaps we need to add a specialized `Message` implementation for Ollama, such as `OllamaAssistantMessage`. I'm not sure whether this would be considered a significant change.
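The proposed `OllamaAssistantMessage` could be sketched roughly as follows. This is a hypothetical shape, not code from the PR: only the idea that an Ollama-specific assistant message carries the prior `thinking` text so it can be replayed to the model on the next request.

```java
// Hypothetical sketch of an Ollama-specific assistant message that keeps the
// model's prior "thinking" alongside the visible content, so it can be sent
// back on subsequent requests. Not Spring AI API; names are illustrative.
record OllamaAssistantMessage(String content, String thinking) {

	// Whether there is any reasoning text worth replaying to the model.
	boolean hasThinking() {
		return thinking != null && !thinking.isBlank();
	}
}
```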